AITopics

Country:

North America > Canada (0.46)
North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry:

Retail (0.40)
Social Sector (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Neural Information Processing Systems, Mar-22-2026, 05:48:01 GMT

No Free Delivery Service: Epistemic limits of passive data collection in complex social systems

Rapid model validation via the train-test paradigm has been a key driver for the breathtaking progress in machine learning and AI. However, modern AI systems often depend on a combination of tasks and data collection practices that violate all assumptions ensuring test validity. Yet, without rigorous model validation we cannot ensure the intended outcomes of deployed AI systems, including positive social impact, nor continue to advance AI research in a scientifically sound way. In this paper, I will show that for widely considered inference settings in complex social systems the train-test paradigm not only lacks a justification but is in fact invalid for any risk estimator, including counterfactual and causal estimators, with high probability. These formal impossibility results highlight a fundamental epistemic issue, i.e., that for key tasks in modern AI we cannot know whether models are valid under current data collection practices. Importantly, this includes variants of both recommender systems and reasoning via large language models, and neither naïve scaling nor limited benchmarks are suited to address this issue. I illustrate these results via the widely used MovieLens benchmark and conclude by discussing the implications of these results for AI in social systems, including possible remedies such as participatory data curation and open science.

artificial intelligence, machine learning, proceedings, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.99)

Neural Information Processing Systems, Feb-17-2026, 18:03:25 GMT

b97fc02c9e536d68300d82be05c23aa2-Paper-Conference.pdf

large language model, machine learning, natural language, (24 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(4 more...)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.93)
Information Technology > Data Science > Data Quality (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
(2 more...)

Neural Information Processing Systems, May-27-2025, 14:17:14 GMT

No Free Delivery Service: Epistemic limits of passive data collection in complex social systems

artificial intelligence, complex social system, machine learning, (7 more...)

Industry: Retail (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Mar-19-2025

Model Risk Management for Generative AI In Financial Institutions

Bhattacharyya, Anwesha, Yu, Ye, Yang, Hanyu, Singh, Rahul, Joshi, Tarun, Chen, Jie, Yalavarthy, Kiran

The success of OpenAI's ChatGPT in 2023 has spurred financial enterprises to explore Generative AI applications that reduce costs or drive revenue across different lines of business in the financial industry. While these applications offer strong potential for efficiencies, they introduce new model risks, primarily hallucinations and toxicity. As highly regulated entities, financial enterprises (primarily large US banks) are obligated to enhance their model risk framework with additional testing and controls to ensure safe deployment of such applications. This paper outlines the key aspects of model risk management for generative AI models, with a special emphasis on the additional practices required in model validation.

large language model, machine learning, natural language, (17 more...)

2503.15668

Country: North America > United States (0.69)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)
Government > Regional Government > North America Government > United States Government (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

arXiv.org Machine Learning, Nov-20-2024

No Free Delivery Service: Epistemic limits of passive data collection in complex social systems

Nickel, Maximilian

Rapid model validation via the train-test paradigm has been a key driver for the breathtaking progress in machine learning and AI. However, modern AI systems often depend on a combination of tasks and data collection practices that violate all assumptions ensuring test validity. Yet, without rigorous model validation we cannot ensure the intended outcomes of deployed AI systems, including positive social impact, nor continue to advance AI research in a scientifically sound way. In this paper, I will show that for widely considered inference settings in complex social systems the train-test paradigm not only lacks a justification but is in fact invalid for any risk estimator, including counterfactual and causal estimators, with high probability. These formal impossibility results highlight a fundamental epistemic issue, i.e., that for key tasks in modern AI we cannot know whether models are valid under current data collection practices. Importantly, this includes variants of both recommender systems and reasoning via large language models, and neither naïve scaling nor limited benchmarks are suited to address this issue. I illustrate these results via the widely used MovieLens benchmark and conclude by discussing the implications of these results for AI in social systems, including possible remedies such as participatory data curation and open science.

possible world, social system, validity, (17 more...)

arXiv.org Machine Learning

2411.13653

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
(6 more...)

Genre: Research Report (0.50)

Industry: Retail (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

arXiv.org Artificial Intelligence, Nov-6-2023

InterVLS: Interactive Model Understanding and Improvement with Vision-Language Surrogates

Huang, Jinbin, He, Wenbin, Gou, Liang, Ren, Liu, Bryan, Chris

Deep learning models are widely used in critical applications, highlighting the need for pre-deployment model understanding and improvement. Visual concept-based methods, while increasingly used for this purpose, face challenges: (1) most concepts lack interpretability, (2) existing methods require model knowledge, which is often unavailable at run time, and (3) there is no no-code method for post-understanding model improvement. To address these challenges, we present InterVLS. The system facilitates model understanding by discovering text-aligned concepts and measuring their influence with model-agnostic linear surrogates. Employing visual analytics, InterVLS offers concept-based explanations and performance insights. It enables users to adjust concept influences to update a model, facilitating no-code model improvement. We evaluate InterVLS in a user study, illustrating its functionality with two scenarios. Results indicate that InterVLS effectively helps users identify the concepts that influence a model, gain insights, and adjust concept influence to improve the model. We conclude with a discussion based on our study results.

explanation, intervls, surrogate model, (14 more...)

2311.03547

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Arizona (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Industry:

Transportation (0.46)
Information Technology (0.46)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Gosar, Viral, Alirezaei, Mohsen, Besselink, Igo, Nijmeijer, Henk

Model Validation of a Low-Speed and Reverse Driving Articulated Vehicle

arXiv.org Artificial Intelligence, Oct-1-2023

For the autonomous operation of articulated vehicles at distribution centers, accurate positioning of the vehicle is of the utmost importance. Automation of these vehicles poses several challenges, e.g., large swept path, asymmetric steering response, large side slip angles of non-steered trailer axles, and trailer instability while reversing. Therefore, a validated vehicle model is required that accurately and efficiently predicts the states of the vehicle. Unlike forward driving, open-loop validation methods cannot be used for reverse driving of articulated vehicles due to their unstable dynamics. This paper proposes an approach to stabilize the unstable pole of the system and compares three vehicle models (kinematic, non-linear single track, and multibody dynamics) against real-world test data obtained from low-speed experiments at a distribution center. It is concluded that the non-linear single track model performs better than the other models for large articulation angles and reverse driving maneuvers.

articulation angle, maneuver, vehicle, (13 more...)

2310.00691

Country:

Asia > Japan (0.06)
Europe > Netherlands > North Brabant > Eindhoven (0.05)
Europe > Netherlands > South Holland > Delft (0.04)

Genre: Research Report (0.64)

Industry:

Automobiles & Trucks (1.00)
Transportation > Freight & Logistics Services (0.89)
Transportation > Ground > Road (0.48)

Technology: Information Technology > Artificial Intelligence (1.00)

arXiv.org Artificial Intelligence, May-2-2023

Efficient Federated Learning with Enhanced Privacy via Lottery Ticket Pruning in Edge Computing

Shi, Yifan, Wei, Kang, Shen, Li, Li, Jun, Wang, Xueqian, Yuan, Bo, Guo, Song

Federated learning (FL) is a collaborative learning paradigm for decentralized private data from mobile terminals (MTs). However, it suffers from issues in terms of communication, resource of MTs, and privacy. Existing privacy-preserving FL methods usually adopt the instance-level differential privacy (DP), which provides a rigorous privacy guarantee but with several bottlenecks: severe performance degradation, transmission overhead, and resource constraints of edge devices such as MTs. To overcome these drawbacks, we propose Fed-LTP, an efficient and privacy-enhanced FL framework with Lottery Ticket Hypothesis (LTH) and zero-concentrated DP (zCDP). It generates a pruned global model on the server side and conducts sparse-to-sparse training from scratch with zCDP on the client side. On the server side, two pruning schemes are proposed: (i) the weight-based pruning (LTH) determines the pruned global model structure; (ii) the iterative pruning further shrinks the size of the pruned model's parameters. Meanwhile, the performance of Fed-LTP is also boosted via model validation based on the Laplace mechanism. On the client side, we use sparse-to-sparse training to solve the resource-constraints issue and provide tighter privacy analysis to reduce the privacy budget. We evaluate the effectiveness of Fed-LTP on several real-world datasets in both independent and identically distributed (IID) and non-IID settings. The results clearly confirm the superiority of Fed-LTP over state-of-the-art (SOTA) methods in communication, computation, and memory efficiencies while realizing a better utility-privacy trade-off.

artificial intelligence, fed-ltp, machine learning, (14 more...)

2305.01387

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Hong Kong (0.04)
(2 more...)

Genre:

Research Report (0.82)
Contests & Prizes (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Macocco, Iuri, Glielmo, Aldo, Grilli, Jacopo, Laio, Alessandro

Intrinsic dimension estimation for discrete metrics

arXiv.org Artificial Intelligence, Mar-12-2023

Real-world datasets characterized by discrete features are ubiquitous: from categorical surveys to clinical questionnaires, from unweighted networks to DNA sequences. Nevertheless, the most common unsupervised dimensional reduction methods are designed for continuous spaces, and their use for discrete spaces can lead to errors and biases. In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces. We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting, finding a surprisingly small ID, of order 2. This suggests that evolutionary pressure acts on a low-dimensional manifold despite the high dimensionality of the sequence space.

artificial intelligence, bayesian inference, machine learning, (17 more...)

doi: 10.1103/PhysRevLett.130.067401

2207.09688

Country: Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)